Speaker Personality Classification Using Systems Based on Acoustic-Lexical Cues and an Optimal Tree-Structured Bayesian Network
Authors
Abstract
Automatic classification of human personality along the Big Five dimensions is an interesting problem with several practical applications. This paper makes several contributions to it. First, we propose a few automatically derived, personality-discriminating lexical features that provide information complementary to conventional acoustic-prosodic cues. We also design a frame-level Gaussian mixture model (GMM) based system that adds complementary information to systems trained on global statistical functionals. Next, we note that the Big Five dimensions are correlated, and we therefore model the dependency between them as an optimal tree-structured Bayesian network. Our final sub-system consists of within-class covariance normalization followed by L1-regularized logistic regression. Fusion of all these sub-systems achieves better classification performance than independently trained classifiers using only acoustic features.
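The abstract does not say how the optimal tree over the Big Five dimensions is found. One standard way to fit a tree-structured network to discrete labels is the Chow-Liu procedure: compute the empirical mutual information between every pair of variables and keep a maximum-weight spanning tree. The sketch below assumes that approach and binary (high/low) trait labels. The names `mutual_information` and `chow_liu_edges` are illustrative and do not come from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def chow_liu_edges(rows):
    """Return the edges (i, j) of a maximum-weight spanning tree, where the
    weight of an edge is the pairwise mutual information between columns.

    rows: list of equal-length tuples of discrete labels, one tuple per
    speaker and one column per trait (assumed layout, not from the paper).
    """
    d = len(rows[0])
    cols = [[row[i] for row in rows] for i in range(d)]
    weights = {
        (i, j): mutual_information(cols[i], cols[j])
        for i, j in combinations(range(d), 2)
    }

    # Kruskal's algorithm with a small union-find structure.
    parent = list(range(d))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    edges = []
    for (i, j), _w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges
```

Calling `chow_liu_edges` on a list of five-element label tuples returns four undirected edges. Choosing any trait as the root and directing the edges away from it gives a tree-structured Bayesian network whose conditional probability tables can be estimated from the same counts.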
Similar resources
OPTIMIZATION OF TREE-STRUCTURED GAS DISTRIBUTION NETWORK USING ANT COLONY OPTIMIZATION: A CASE STUDY
An Ant Colony Optimization (ACO) algorithm is proposed for optimal tree-structured natural gas distribution network. Design of pipelines, facilities, and equipment systems are necessary tasks to configure an optimal natural gas network. A mixed integer programming model is formulated to minimize the total cost in the network. The aim is to optimize pipe diameter sizes so that the location-alloc...
Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features
This paper proposes to integrate multi-modal features using conditional random fields (CRF) for broadcast news story segmentation. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness, acoustic features involve pause duration, pitch, speaker change and audio event type, and visual fea...
Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles
Herein we present a comparison of novel concepts for a robust fusion of prosodic and verbal cues in speech emotion recognition. Thereby 276 acoustic features are extracted out of a spoken phrase. For linguistic content analysis we use the Bag-of-Words text representation. This allows for integration of acoustic and linguistic features within one vector prior to a final classification. Extensive...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Laughter Valence Prediction in Motivational Interviewing Based on Lexical and Acoustic Cues
Motivational Interviewing (MI) is a goal oriented psychotherapy counseling that aims to instill positive change in a client through discussion. Since the discourse is in the form of semi-structured natural conversation, it often involves a variety of non-verbal social and affective behaviors such as laughter. Laughter carries information related to affect, mood and personality and can offer a w...
Journal title:
Volume, issue:
Pages: -
Publication date: 2012